Abstract

We establish a relationship between the online mistake-bound model of learning and resource-bounded
dimension. This connection is combined with the Winnow algorithm to obtain new
results about the density of hard sets under adaptive reductions. This improves previous work 
of Fu (1995) and Lutz and Zhao (2000), and solves one of Lutz and Mayordomo’s “Twelve 
Problems in Resource-Bounded Measure” (1999). 


1 Introduction 

This paper has two main contributions: (i) establishing a close relationship between resource-bounded
dimension and Littlestone’s online mistake-bound model of learning, and (ii) using this
relationship along with the Winnow algorithm to resolve an open problem in computational complexity.
In this introduction we briefly describe these contributions.

1.1 Online Learning and Dimension 

Lindner, Schuler, and Watanabe studied connections between computational learning theory
and resource-bounded measure, primarily working with the probably approximately correct
(PAC) model. They also included the observation that any “admissible” subclass of P/poly that
is polynomial-time learnable in Angluin’s exact learning model [2] must have p-measure 0. The
proof of this made use of the essential equivalence between Angluin’s model and Littlestone’s online
mistake-bound model.

In the online mistake-bound model, a learner is presented a sequence of examples, and is asked 
to predict whether or not they belong to some unknown target concept. The concept is drawn from 
some concept class, which is known to the learner, and the examples may be chosen by an adversary. 
After making its prediction about each example, the learner is told the correct classification for the
example, and the learner may use this knowledge in making future predictions. The mistake bound of
the learner is the maximum number of incorrect predictions the learner will make, over any choice 
of target concept and sequence of examples. 

We push the observation of Lindner, Schuler, and Watanabe much further, developing a powerful,
general framework for showing that classes have resource-bounded dimension 0. Resource-bounded
measure and dimension involve betting on the membership of strings in an unknown set. To prove
that a class has dimension 0, we show that it suffices to give a reduction to a family of concept
classes that has a good mistake-bound learning algorithm. The reduction and the learning algorithm
may both take exponential time, as long as the mistake bound of the algorithm is subexponential.
If we have a reduction from the unknown set to a concept in a learnable concept class, we can view
the reduction as generating a sequence of examples, apply the learning algorithm to these examples,
and use the learning algorithm’s predictions to design a good betting strategy. Formal details of
this framework are given in Section 3.

1.2 Density of Hard Sets 

The two most common notions of polynomial-time reductions are many-one ($\leq^p_m$) and Turing
($\leq^p_T$). A many-one reduction from A to B maps instances of A to instances of B, preserving
membership. A Turing reduction from A to B makes many, possibly adaptive, queries to B in order to
solve A. Many-one reductions are a special case of Turing reductions. In between $\leq^p_m$ and
$\leq^p_T$ is a wide variety of polynomial-time reductions of different strengths.

A common use of reductions is to demonstrate hardness for a complexity class. Let $\leq^p_\tau$ be a
polynomial-time reducibility. For any set $B$, let $P_\tau(B) = \{A \mid A \leq^p_\tau B\}$ be the
class of all problems that $\leq^p_\tau$-reduce to $B$. We say that $B$ is $\leq^p_\tau$-hard for a
complexity class $\mathcal{C}$ if $\mathcal{C} \subseteq P_\tau(B)$, that is, every problem in
$\mathcal{C}$ $\leq^p_\tau$-reduces to $B$. For a class $\mathcal{D}$ of sets, a useful notation is
$P_\tau(\mathcal{D}) = \bigcup_{B \in \mathcal{D}} P_\tau(B)$.

A problem $B$ is dense if there exists $\epsilon > 0$ such that $|B_{\leq n}| > 2^{n^\epsilon}$ for
all but finitely many $n$. All known hard sets for the exponential-time complexity classes
$\mathrm{E} = \mathrm{DTIME}(2^{O(n)})$ or $\mathrm{EXP} = \mathrm{DTIME}(2^{n^{O(1)}})$ are dense.
Whether every hard set must be dense has been often studied. First, Meyer showed that every
$\leq^p_m$-hard set for E must be dense, and he observed that proving the same for
$\leq^p_T$-reductions would imply that E has exponential circuit-size complexity. Since then, a
line of research has obtained results for a variety of reductions between $\leq^p_m$ and
$\leq^p_T$, specifically the conjunctive ($\leq^p_c$) and disjunctive ($\leq^p_d$) reductions, and
for various functions $f(n)$, the bounded query $\leq^p_{f(n)\text{-tt}}$ and
$\leq^p_{f(n)\text{-T}}$ reductions:

1. Watanabe showed that every hard set for E under the $\leq^p_c$, $\leq^p_d$, or
$\leq^p_{O(\log n)\text{-tt}}$ reductions is dense.

2. Lutz and Mayordomo showed that for all $\alpha < 1$, the class
$P_{n^\alpha\text{-tt}}(\mathrm{DENSE}^c)$ has p-measure 0, where DENSE is the class of all dense
sets. Since E does not have p-measure 0, their result implies that every
$\leq^p_{n^\alpha\text{-tt}}$-hard set for E is dense.

3. Fu showed that for all $\alpha < 1/2$, every $\leq^p_{n^\alpha\text{-T}}$-hard set for E is
dense, and that for all $\alpha < 1$, every $\leq^p_{n^\alpha\text{-T}}$-hard set for EXP is dense.

4. Lutz and Zhao gave a measure-theoretic strengthening of Fu’s results, showing that for all
$\alpha < 1/2$, $P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$ has p-measure 0, and that for all
$\alpha < 1$, $P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$ has $\mathrm{p}_2$-measure 0.

This contrast between E and EXP in the last two references was left as a curious open problem,
and exposited by Lutz and Mayordomo as one of their “Twelve Problems in Resource-Bounded
Measure”:

Problem 6. For $1/2 \leq \alpha < 1$, is it the case that $P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$
has p-measure 0 (or at least, that $\mathrm{E} \not\subseteq P_{n^\alpha\text{-T}}(\mathrm{SPARSE})$)?

We resolve this problem, showing the much stronger conclusion that the classes in question have
p-dimension 0. But first, in Section 4, we prove a theorem about disjunctive reductions that
illustrates the basic idea of our technique. We show that the class $P_d(\mathrm{DENSE}^c)$ has
p-dimension 0. The proof uses the learning framework of Section 3 and Littlestone’s Winnow algorithm.
Suppose that $A \leq^p_d S$, where $S$ is a nondense set. Then there is a reduction $g$ mapping
strings to sets of strings such that $x \in A$ if and only if at least one string in $g(x)$ belongs
to $S$. We view the reduction $g$ as generating examples that we can use to learn a disjunction
based on $S$. Because $S$ is subexponentially dense, the target disjunction involves a
subexponential number of variables out of exponentially many variables. This is truly a case “when
irrelevant attributes abound,” and the Winnow algorithm performs exceedingly well to establish our
dimension result. In the same section we also use the learning framework to show that
$P_c(\mathrm{DENSE}^c)$ has p-dimension 0. These results give new proofs of Watanabe’s
aforementioned theorems about $\leq^p_d$-hard and $\leq^p_c$-hard sets for E.

Our main theorem, presented in Section 5, is that for all $\alpha < 1$,
$P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$ has p-dimension 0. This substantially improves the
results of Lutz and Mayordomo, Fu, and Lutz and Zhao described above. The earlier resource-bounded
measure proofs use the concept of weak stochasticity. As observed by Mayordomo, this
stochasticity approach can be extended to show a $-1^{\mathrm{st}}$-order scaled dimension result,
but it seems a different technique is needed for an (unscaled) dimension result. Our learning
framework turns out to be just what is needed. We reduce the class
$P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$ to a family of learnable disjunctions. For this, we make
use of a technique that Allender, Hemaspaandra, Ogiwara, and Watanabe used to prove a surprising
result converting bounded-query reductions to sparse sets into disjunctive reductions to sparse
sets: $P_{btt}(\mathrm{SPARSE}) \subseteq P_d(\mathrm{SPARSE})$. Carefully applying the
same technique on a sublinear-query Turing reduction to a nondense set results in a disjunction
with a nearly exponential blowup, but it can still be learned by Winnow in our dimension setting.

The density of complete and hard sets for NP has also been studied often, with motivation
coming originally from the Berman-Hartmanis isomorphism conjecture: all many-one complete
sets are dense if the isomorphism conjecture holds. Since no absolute results about the density
of NP-complete or NP-hard sets can be proved without separating P from NP, the approach has
been to prove conditional results under a hypothesis on NP. Mahaney showed that if P $\neq$ NP,
then no sparse set is $\leq^p_m$-hard for NP. Ogiwara and Watanabe extended Mahaney’s theorem
to the $\leq^p_{btt}$-hard sets. Deriving a result from P $\neq$ NP about NP-hard sets under
unbounded truth-table reductions is still an open problem, but a measure-theoretic assumption
yields very strong consequences. Lutz and Zhao showed that under the hypothesis “NP does not have
p-measure 0,” every $\leq^p_{n^\alpha\text{-T}}$-hard set for NP must be dense, for all
$\alpha < 1$. In Section 6 we present the same conclusion under the weaker hypothesis “NP has
positive p-dimension,” and some additional consequences.

2 Preliminaries 

The set of all binary strings is $\{0,1\}^*$. The length of a string $x \in \{0,1\}^*$ is $|x|$. We
write $\lambda$ for the empty string. For $n \in \mathbb{N}$, $\{0,1\}^n$ is the set of strings of
length $n$ and $\{0,1\}^{\leq n}$ is the set of strings of length at most $n$.

A language is a subset $L \subseteq \{0,1\}^*$. We write $L_{\leq n} = L \cap \{0,1\}^{\leq n}$ and
$L_{=n} = L \cap \{0,1\}^n$. We say that $L$ is sparse if there is a polynomial $p(n)$ such that
for all $n$, $|L_{=n}| \leq p(n)$. We say that $L$ is (exponentially) dense if there is a constant
$\epsilon > 0$ such that $|L_{\leq n}| > 2^{n^\epsilon}$ for all sufficiently large $n$. We write
SPARSE and DENSE for the classes of all sparse languages and all dense languages.
The complement $\mathrm{DENSE}^c$ of DENSE is the class of all nondense languages.


2.1 Polynomial-Time Reductions 

We use standard notions of polynomial-time reducibilities: 

• Turing reducibility: $A \leq^p_T B$ if there is a polynomial-time oracle machine $M$ such that
$A = L(M^B)$.

• Truth-table reducibility: $A \leq^p_{tt} B$ if there is a polynomial-time oracle machine $M$ that
makes nonadaptive queries such that $A = L(M^B)$.

• Disjunctive reducibility: $A \leq^p_d B$ if there is a polynomial-time computable
$f : \{0,1\}^* \to \mathcal{P}(\{0,1\}^*)$ such that for all $x$, $x \in A$ if and only if
$f(x) \cap B \neq \emptyset$.

• Conjunctive reducibility: $A \leq^p_c B$ if there is a polynomial-time computable
$f : \{0,1\}^* \to \mathcal{P}(\{0,1\}^*)$ such that for all $x$, $x \in A$ if and only if
$f(x) \subseteq B$.

We write $A \leq^p_{q(n)\text{-T}} B$ or $A \leq^p_{q(n)\text{-tt}} B$ to indicate that the
reduction makes at most $q(n)$ queries on any input of length $n$. The bounded reducibility
$A \leq^p_{btt} B$ means $A \leq^p_{k\text{-tt}} B$ for some constant $k$.

Let $\leq^p_\tau$ be a polynomial-time reducibility. For any language $B$, we define
$P_\tau(B) = \{A \mid A \leq^p_\tau B\}$. A language $B$ is $\leq^p_\tau$-hard for a class
$\mathcal{C}$ if $\mathcal{C} \subseteq P_\tau(B)$. For any class $\mathcal{D}$ of languages,
$P_\tau(\mathcal{D}) = \bigcup_{B \in \mathcal{D}} P_\tau(B)$.

2.2 Resource-Bounded Measure and Dimension 

Resource-bounded measure and dimension were introduced in prior work. Here we briefly review the
definitions and basic properties. We refer to the original sources and also the surveys
for more information.

The Cantor space is $\mathbf{C} = \{0,1\}^\infty$. Each language $A \subseteq \{0,1\}^*$ is
identified with its characteristic sequence $\chi_A \in \mathbf{C}$ according to the standard
(lexicographic) enumeration of $\{0,1\}^*$. We typically write $A$ in place of $\chi_A$. In this
way a complexity class $\mathcal{C} \subseteq \mathcal{P}(\{0,1\}^*)$ is viewed as a subset
$\mathcal{C} \subseteq \mathbf{C}$.

We use the notation $S \upharpoonright n$ to denote the first $n$ bits of a sequence
$S \in \mathbf{C}$.

Let $s > 0$ be a real number. An $s$-gale is a function $d : \{0,1\}^* \to [0,\infty)$ such that
for all $w \in \{0,1\}^*$,

$$d(w) = \frac{d(w0) + d(w1)}{2^s}.$$

A martingale is a 1-gale.
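
As a small illustration of the averaging condition, the sketch below checks it numerically for one simple $s$-gale, the one that never bets and therefore has value $2^{(s-1)|w|}$ on every string $w$. The function names `uniform_gale` and `satisfies_gale_condition` are hypothetical and chosen only for this example.

```python
import math
from itertools import product


def uniform_gale(s: float, w: str) -> float:
    """Value of the non-betting s-gale on w: d(w) = 2^((s - 1) * |w|)."""
    return 2.0 ** ((s - 1.0) * len(w))


def satisfies_gale_condition(d, s: float, w: str) -> bool:
    """Check d(w) = (d(w0) + d(w1)) / 2^s up to floating-point error."""
    return math.isclose(d(s, w), (d(s, w + "0") + d(s, w + "1")) / 2.0 ** s)


def check_all_short_strings(s: float, max_len: int = 6) -> bool:
    """Verify the condition on every binary string of length at most max_len."""
    return all(
        satisfies_gale_condition(uniform_gale, s, "".join(bits))
        for length in range(max_len + 1)
        for bits in product("01", repeat=length)
    )
```

For $s = 1$ this gale is the constant martingale; for $s < 1$ its value shrinks along every sequence, which is why an $s$-gale must bet well in order to succeed.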

The goal of an s-gale is to obtain large values on sequences: 

Definition. Let $d$ be an $s$-gale and $S \in \mathbf{C}$.

1. $d$ succeeds on $S$ if $\limsup_{n \to \infty} d(S \upharpoonright n) = \infty$.

2. $d$ succeeds strongly on $S$ if $\liminf_{n \to \infty} d(S \upharpoonright n) = \infty$.

3. The success set of $d$ is $S^\infty[d] = \{S \in \mathbf{C} \mid d \text{ succeeds on } S\}$.

4. The strong success set of $d$ is
$S^\infty_{\mathrm{str}}[d] = \{S \in \mathbf{C} \mid d \text{ succeeds strongly on } S\}$.

Notice that the smaller s is, the more difficult it is for an s-gale to obtain large values. Succeeding 
martingales (s = 1) imply measure 0, and the infimum s for which an s-gale can succeed (or strongly 
succeed) gives the dimension (or strong dimension): 

Definition. Let $X \subseteq \mathbf{C}$.

1. $X$ has p-measure 0, written $\mu_p(X) = 0$, if there is a polynomial-time computable martingale
$d$ such that $X \subseteq S^\infty[d]$.

2. The p-dimension of $X$, written $\dim_p(X)$, is the infimum of all $s$ such that there exists a
polynomial-time computable $s$-gale $d$ with $X \subseteq S^\infty[d]$.

3. The strong p-dimension of $X$, written $\mathrm{Dim}_p(X)$, is the infimum of all $s$ such that
there exists a polynomial-time computable $s$-gale $d$ with $X \subseteq S^\infty_{\mathrm{str}}[d]$.

We now summarize some of the basic properties of the p-dimensions and p-measure. 

Proposition 2.1. Let $X, Y \subseteq \mathbf{C}$.

1. $0 \leq \dim_p(X) \leq \mathrm{Dim}_p(X) \leq 1$.

2. If $\dim_p(X) < 1$, then $\mu_p(X) = 0$.

3. If $X \subseteq Y$, then $\dim_p(X) \leq \dim_p(Y)$ and $\mathrm{Dim}_p(X) \leq \mathrm{Dim}_p(Y)$.

The following theorem indicates that the p-dimensions are useful for studies within the complexity
class E.

Theorem 2.2.

1. $\mu_p(\mathrm{E}) \neq 0$. In particular, $\dim_p(\mathrm{E}) = \mathrm{Dim}_p(\mathrm{E}) = 1$.

2. For all $c \in \mathbb{N}$, $\mathrm{Dim}_p(\mathrm{DTIME}(2^{cn})) = 0$.

2.3 Online Mistake-Bound Model of Learning

A concept is a set $C \subseteq U$, where $U$ is some universe. A concept $C$ is often identified
with its characteristic function $f_C : U \to \{0,1\}$. In this paper the universe is always a set
of binary strings. A concept class is a set of concepts $\mathcal{C} \subseteq \mathcal{P}(U)$.

Given a concept class $\mathcal{C}$ and a universe $U$, a learning algorithm tries to learn an
unknown target concept $C \in \mathcal{C}$. The algorithm is given a sequence of examples
$x_1, x_2, \ldots$ in $U$. When given each example $x_i$, the algorithm must predict if
$x_i \in C$ or $x_i \notin C$. The algorithm is then told the correct answer and given the next
example. The algorithm makes a mistake if its prediction for membership of $x_i$ in $C$ is wrong.
This proceeds until every member of $U$ is given as an example.

The goal is to minimize the number of mistakes. The mistake bound of a learning algorithm
$\mathcal{A}$ for a concept class $\mathcal{C}$ is the maximum over all $C \in \mathcal{C}$ of the
number of mistakes $\mathcal{A}$ makes when learning $C$, over all possible sequences of examples.
The running time of $\mathcal{A}$ on $\mathcal{C}$ is the maximum time $\mathcal{A}$ takes to make
a prediction.

2.4 Disjunctions and Winnow

An interesting concept class is the class of monotone disjunctions, which can be efficiently learned
by Littlestone’s Winnow algorithm. A monotone disjunction on $\{0,1\}^n$ is a formula of the
form $\phi_V = \bigvee_{i \in V} x_i$, where $V \subseteq \{1,\ldots,n\}$ and we write a string
$x \in \{0,1\}^n$ as $x = x_1 \cdots x_n$.

The concept $\phi_V$ can also be viewed as the set $\{x \in \{0,1\}^n \mid \phi_V(x) = 1\}$ or
equivalently as $\{A \subseteq \{1,\ldots,n\} \mid A \cap V \neq \emptyset\}$.

The Winnow algorithm has two parameters $\alpha$ (a weight update multiplier) and $\theta$ (a
threshold value). Initially, each variable $x_i$ has a weight $w_i = 1$. To classify a string $x$,
the algorithm predicts that $x$ is in the concept if $\sum_i w_i x_i > \theta$, and not in the
concept otherwise. The weights are updated as follows whenever a mistake is made.

• If a negative example $x$ is incorrectly classified, then set $w_i := 0$ for all $i$ such that
$x_i = 1$. (Certainly these $x_i$’s are not in the disjunction.)

• If a positive example $x$ is incorrectly classified, then set $w_i := \alpha \cdot w_i$ for all
$i$ such that $x_i = 1$. (It is considered more likely that these $x_i$’s are in the disjunction.)

A useful setting of the parameters is $\alpha = 2$ and $\theta = n/2$. With these parameters,
Littlestone proved that Winnow will make at most $2k \log n + 2$ mistakes when the target
disjunction has at most $k$ literals. Also, the algorithm uses $O(n)$ time to classify each example
and update the weights.
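
The update rules above can be sketched in a few lines of Python. This is a minimal illustration under the parameter setting $\alpha = 2$, $\theta = n/2$ described in the text; the class name `Winnow`, the helper `run_demo`, and the choice of target disjunction are assumptions made only for this example.

```python
import math
from itertools import product


class Winnow:
    """Winnow for monotone disjunctions over n Boolean variables."""

    def __init__(self, n, alpha=2.0, theta=None):
        self.alpha = alpha
        self.theta = n / 2 if theta is None else theta
        self.w = [1.0] * n  # every variable starts with weight 1
        self.mistakes = 0

    def predict(self, x):
        # Predict "in the concept" when the active weight exceeds the threshold.
        return sum(wi for wi, xi in zip(self.w, x) if xi) > self.theta

    def update(self, x, label):
        # Weights change only when the prediction was wrong.
        if self.predict(x) == label:
            return
        self.mistakes += 1
        for i, xi in enumerate(x):
            if xi:
                # Promote on a missed positive, eliminate on a missed negative.
                self.w[i] = self.w[i] * self.alpha if label else 0.0


def run_demo(n=8, relevant=(0, 3)):
    """Feed every string of length n once; return (mistakes, 2k log n + 2)."""
    learner = Winnow(n)
    for x in product((0, 1), repeat=n):
        learner.update(x, any(x[i] for i in relevant))
    return learner.mistakes, 2 * len(relevant) * math.log2(n) + 2
```

Calling `run_demo()` presents all 256 strings of length 8 for a two-literal target, and the mistake count stays within the bound $2k \log n + 2 = 14$ regardless of the order of examples.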

3 Learning and Dimension 

In this section we present a framework relating online learning to resource-bounded dimension. 
This framework is based on reducibility to learnable concept class families. 

Definition. A sequence $\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$ of concept classes is
called a concept class family.

We consider two types of reductions:

Definition. Let $L \subseteq \{0,1\}^*$, $\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$ be a
concept class family, and $r(n)$ be a time bound.

1. We say $L$ strongly reduces to $\mathcal{C}$ in $r(n)$ time, and we write
$L \leq^{r}_{\mathrm{str}} \mathcal{C}$, if there exists a sequence of target concepts
$(c_n \in \mathcal{C}_n \mid n \in \mathbb{N})$ and a reduction $f$ computable in $O(r(n))$ time
such that for all but finitely many $n$, for all $x \in \{0,1\}^n$, $x \in L$ if and only if
$f(x) \in c_n$.

2. We say $L$ weakly reduces to $\mathcal{C}$ in $r(n)$ time, and we write
$L \leq^{r}_{\mathrm{wk}} \mathcal{C}$, if there is a reduction $f$ computable in $O(r(n))$ time
such that for infinitely many $n$, there is a concept $c_n \in \mathcal{C}_n$ such that for all
$x \in \{0,1\}^{\leq n}$, $x \in L$ if and only if $f(0^n, x) \in c_n$.

It is necessary to quantify both the time complexity and mistake bound for learning a concept 
class family: 

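Definition. Let $t, m : \mathbb{N} \to \mathbb{N}$ and let
$\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$ be a concept class family. We say that
$\mathcal{C} \in \mathcal{L}(t, m)$ if there is an algorithm that learns $\mathcal{C}_n$ in
$O(t(n))$ time with mistake bound $m(n)$.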

Combining the two previous definitions we arrive at our central technical concept: 

Definition. Let $r, t, m : \mathbb{N} \to \mathbb{N}$.

1. $\mathcal{RL}_{\mathrm{str}}(r, t, m)$ is the class of all languages that
$\leq^{r}_{\mathrm{str}}$-reduce to some concept class family in $\mathcal{L}(t, m)$.

2. $\mathcal{RL}_{\mathrm{wk}}(r, t, m)$ is the class of all languages that
$\leq^{r}_{\mathrm{wk}}$-reduce to some concept class family in $\mathcal{L}(t, m)$.

A remark about the parameters in this definition is in order. If
$A \in \mathcal{RL}_{\mathrm{str}}(r, t, m)$, then $A \leq^{r}_{\mathrm{str}} \mathcal{C}$ for some
concept class family $\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$. Then $x \in A_{=n}$ if
and only if $f(x) \in c_n$, where $c_n \in \mathcal{C}_n$ is the target concept and $f$ is the
reduction. We emphasize that the complexity of learning $\mathcal{C}_n$ is measured in terms of
$n = |x|$, and not the size of $c_n$ or $f(x)$. Instead $\mathcal{C}_n$ is learnable in time
$O(t(n))$ with mistake bound $m(n)$.

The following theorem is the main technical tool in this paper. Here we consider exponential-time
reductions to concept classes that can be learned in exponential time, but with subexponentially
many mistakes.

Theorem 3.1. Let $c \in \mathbb{N}$.

1. $\mathcal{RL}_{\mathrm{str}}(2^{cn}, 2^{cn}, o(2^n))$ has strong p-dimension 0.

2. $\mathcal{RL}_{\mathrm{wk}}(2^{cn}, 2^{cn}, o(2^n))$ has p-dimension 0.

Proof. We only prove that $\mathcal{RL}_{\mathrm{wk}}(2^{cn}, 2^{cn}, o(2^n))$ has p-dimension 0.
The other part of the theorem is proved similarly. Let $s > 0$ be such that $2^s$ is rational. It
suffices to show that the class has p-dimension at most $s$.

Let $A \in \mathcal{RL}_{\mathrm{wk}}(2^{cn}, 2^{cn}, o(2^n))$. Then there is a concept class family
$\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N}) \in \mathcal{L}(2^{cn}, o(2^n))$ such that
$A \leq^{2^{cn}}_{\mathrm{wk}} \mathcal{C}$. Let $f$ be this reduction from $A$ to $\mathcal{C}$.
Then for infinitely many $n$, there is a target concept $c_n \in \mathcal{C}_n$ such that

$$x \in A_{\leq n} \iff f(x) \in c_n.$$

Let $J$ be the set of all $n$ such that this concept exists. Let $\mathcal{A}$ be a $2^{cn}$-time
learning algorithm for $\mathcal{C}$ with mistake bound $o(2^n)$.

Fix an $n$ and let $N = 2^{n+1} - 1$. We view the reduction $f$ as generating a sequence of examples

$$f(s_0), f(s_1), \ldots, f(s_{N-1}),$$

one for each string in $\{0,1\}^{\leq n}$. The idea is to run the algorithm $\mathcal{A}$ on this
sequence of examples, trying to learn $c_n$. We will use $\mathcal{A}$’s predictions to define an
$s$-gale $d_n$ inductively as follows.

1. Let $N_0 = 2^{n/2}$. For all strings $w$ with $|w| < N_0$, $d_n(w) = 2^{(s-1)|w|}$.

2. Let $\epsilon$ be a small rational number to be determined later. Let $w$ be any string with
$N_0 \leq |w| < N$. Run $\mathcal{A}$ on the sequence of examples
$f(s_{N_0}), \ldots, f(s_{|w|})$, telling $\mathcal{A}$ that for each $i$, $N_0 \leq i < |w|$,

- If $w[i] = 1$, then $f(s_i)$ is a positive example.

- If $w[i] = 0$, then $f(s_i)$ is a negative example.

At the end $\mathcal{A}$ will output a prediction for $f(s_{|w|})$.

• If $\mathcal{A}$ predicts that $f(s_{|w|})$ is a member of the target concept $c_n$, then we let

- $d_n(w1) = 2^s (1 - \epsilon) d_n(w)$,

- $d_n(w0) = 2^s \epsilon \, d_n(w)$.

• Otherwise, $\mathcal{A}$ predicts that $f(s_{|w|})$ is not a member of the target concept $c_n$,
and we let

- $d_n(w0) = 2^s (1 - \epsilon) d_n(w)$,

- $d_n(w1) = 2^s \epsilon \, d_n(w)$.

3. For $w$ with $|w| \geq N$, we set $d_n(w0) = d_n(w1) = 2^{s-1} d_n(w)$.

The reason for making $d_n$ wait until $N_0$ to bet is computational efficiency. For $|w| < N_0$,
$d_n(w)$ is computable in $O(|w|)$ time. If $|w| \geq N_0$, then to compute $d_n(w)$ we need to
execute $\mathcal{A}$ on at most $|w|$ examples, each execution taking $O(2^{cn})$ time to compute
the example and $O(2^{cn})$ to compute the label, for a total time of $O(|w| 2^{cn})$. Because
$|w| \geq 2^{n/2}$, this simplifies to $O(|w|^{2c+1})$.

Each time $\mathcal{A}$ makes a correct prediction, the value of the $s$-gale is multiplied by a
factor of $2^s(1 - \epsilon)$. When $\mathcal{A}$ makes a mistake, the value is multiplied by
$2^s \epsilon$. Let $w_n$ be the length $N$ prefix of $A$’s characteristic sequence and suppose
that $n \in J$. In the computation of $d_n(w_n)$, observe that $\mathcal{A}$ is told the correct
labels for the examples according to the target concept $c_n$. Let $m_n$ be the number of mistakes
that $\mathcal{A}$ makes on this sequence of examples when learning $c_n$; we know that
$m_n = o(2^n)$. Then

$$
\begin{aligned}
d_n(w_n) &= 2^{(s-1)N_0} \cdot \left(2^s (1-\epsilon)\right)^{N - N_0 - m_n} \cdot \left(2^s \epsilon\right)^{m_n} \\
&= 2^{sN + [(N - N_0 - m_n)\log(1-\epsilon)] + [m_n \log \epsilon] - N_0} \\
&\geq 2^{sN - \left(N \log\frac{1}{1-\epsilon} + m_n \log\frac{1-\epsilon}{\epsilon}\right) - N_0}.
\end{aligned}
$$

We choose $\epsilon \in \mathbb{Q}$ so that $\log\frac{1}{1-\epsilon} < s$ and let
$0 < \delta < s - \log\frac{1}{1-\epsilon}$. Then since $m_n$ and $N_0$ are both $o(N)$, when
$n \in J$ is large enough we have

$$d_n(w_n) \geq 2^{\delta N}.$$

Let $d$ be the $s$-gale $d = \sum_{n=0}^{\infty} 2^{-n} d_n$. Then $A \in S^\infty[d]$. A standard
technique is that, taking the first $|w| + r$ terms of the sum, we can approximate $d(w)$ to
precision $2^{-r}$ in time polynomial in $|w| + r$. Such an $s$-gale can be defined for every set
in $\mathcal{RL}_{\mathrm{wk}}(2^{cn}, 2^{cn}, o(2^n))$. These gales are all computable within the
same time bound, so we can apply a union lemma to conclude that
$\mathcal{RL}_{\mathrm{wk}}(2^{cn}, 2^{cn}, o(2^n))$ has p-dimension at most $s$. □
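
The arithmetic at the heart of this proof can be checked numerically. The sketch below is a simplified illustration, not the construction itself: it omits the waiting phase up to $N_0$, works with base-2 logarithms of the capital to avoid overflow, and takes the learner’s predictions as a given list. The function names `log2_gale_value` and `closed_form` are hypothetical.

```python
import math


def log2_gale_value(bits, predictions, s, eps):
    """log2 of the capital after betting on each bit according to a prediction.

    A correct prediction multiplies the capital by 2^s * (1 - eps);
    a mistake multiplies it by 2^s * eps.
    """
    log_value = 0.0
    for bit, guess in zip(bits, predictions):
        factor = (1.0 - eps) if bit == guess else eps
        log_value += s + math.log2(factor)
    return log_value


def closed_form(num_bits, mistakes, s, eps):
    """Exponent s*N + (N - m) * log(1 - eps) + m * log(eps) for N bets, m mistakes."""
    correct = num_bits - mistakes
    return s * num_bits + correct * math.log2(1.0 - eps) + mistakes * math.log2(eps)


def demo(num_bits=1024, mistakes=10, s=0.5, eps=0.1):
    """Bet on a fixed bit pattern with exactly `mistakes` wrong predictions."""
    bits = [1, 0, 1, 1] * (num_bits // 4)
    predictions = [1 - b for b in bits[:mistakes]] + bits[mistakes:]
    return log2_gale_value(bits, predictions, s, eps)
```

With these illustrative parameters $\log\frac{1}{1-\epsilon} \approx 0.152 < s$, so each correct prediction gains more than the rare mistakes cost, and the exponent grows linearly in the number of bets as long as the mistake count is sublinear.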

4 Disjunctive and Conjunctive Reductions 

In this section, as a warmup to our main theorem, we present two basic applications of Theorem
3.1. First, we consider disjunctive reductions.

Theorem 4.1. $P_d(\mathrm{DENSE}^c)$ has p-dimension 0.

Proof. We will show that
$P_d(\mathrm{DENSE}^c) \subseteq \mathcal{RL}_{\mathrm{wk}}(2^{2n}, 2^{2n}, o(2^n))$. For this, let
$A \in P_d(\mathrm{DENSE}^c)$ be arbitrary. Then there is a set $S \in \mathrm{DENSE}^c$ and a
reduction $f : \{0,1\}^* \to \mathcal{P}(\{0,1\}^*)$ computable in polynomial time $p(n)$ such that
for all $x \in \{0,1\}^*$, $x \in A$ if and only if $f(x) \cap S \neq \emptyset$. Note that
on an input of length $n$, all queries of $f$ have length bounded by $p(n)$. Also, since $S$ is
nondense, for any $\epsilon > 0$ there are infinitely many $n$ such that

$$|S_{\leq p(n)}| \leq 2^{n^\epsilon}. \qquad (4.1)$$

Let $Q_n = \bigcup_{|x| \leq n} f(x)$ be the set of all queries made by $f$ up through length $n$.
Then $|Q_n| \leq 2^{n+1} p(n)$. Enumerate $Q_n$ as $q_1, \ldots, q_N$. Then each subset
$R \subseteq Q_n$ can be identified with its characteristic string $\chi_R \in \{0,1\}^N$ according
to this enumeration. We define $\mathcal{C}_n$ to be the concept class of all monotone disjunctions
on $\{0,1\}^N$ that have at most $2^{n^\epsilon}$ literals. Our target disjunction is

$$\phi_n = \bigvee_{i : q_i \in S} q_i,$$

which is a member of $\mathcal{C}_n$ whenever (4.1) holds. For any $x \in \{0,1\}^{\leq n}$,

$$x \in A \iff \phi_n(\chi_{f(x)}) = 1.$$

Given $x$, $\chi_{f(x)}$ can be computed in $O(2^{2n})$ time. Therefore
$A \leq^{2^{2n}}_{\mathrm{wk}} \mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$.
Since Winnow learns $\mathcal{C}_n$ making at most $2 \cdot 2^{n^\epsilon} \log |Q_n| + 2 = o(2^n)$
mistakes, it follows that $A \in \mathcal{RL}_{\mathrm{wk}}(2^{2n}, 2^{2n}, o(2^n))$. □

Next, we consider conjunctive reductions. 

Theorem 4.2. $P_c(\mathrm{DENSE}^c)$ has p-dimension 0.

Proof. We will show that
$P_c(\mathrm{DENSE}^c) \subseteq \mathcal{RL}_{\mathrm{wk}}(2^{2n}, 2^{2n}, o(2^n))$. For this, let
$A \leq^p_c S \in \mathrm{DENSE}^c$. Then there is a reduction
$f : \{0,1\}^* \to \mathcal{P}(\{0,1\}^*)$ computable in polynomial time $p(n)$ such that
for all $x \in \{0,1\}^*$, $x \in A$ if and only if $f(x) \subseteq S$.

Fix an input length $n$, and let $Q_n = \bigcup_{|x| \leq n} f(x)$. Let $\epsilon > 0$ and consider
the concept class

$$\mathcal{C}_n = \{\mathcal{P}(X) \mid X \subseteq Q_n \text{ and } |X| \leq 2^{n^\epsilon}\}.$$

Our target concept is

$$c_n = \mathcal{P}(S \cap Q_n).$$

For infinitely many $n$, $|S \cap Q_n| \leq |S_{\leq p(n)}| \leq 2^{n^\epsilon}$, in which case
$c_n \in \mathcal{C}_n$. For any $x \in \{0,1\}^{\leq n}$, we have

$$x \in A \iff f(x) \in c_n.$$

Therefore $A \leq^{p(n)}_{\mathrm{wk}} \mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$.

The class $\mathcal{C}_n$ can be learned by a simple algorithm that makes at most $|X|$ mistakes
when learning $\mathcal{P}(X)$. The hypothesis for $X$ is simply the union of all positive examples
seen so far. More explicitly, the algorithm begins with the hypothesis $H = \emptyset$. In any
stage, given an example $Q$, the algorithm predicts ‘yes’ if $Q \subseteq H$ and ‘no’ otherwise. If
the prediction is ‘no,’ but $Q$ is revealed to be a positive example, then the hypothesis is
updated as $H := H \cup Q$. The algorithm will never make a mistake on a negative example, and can
make at most $|X|$ mistakes on positive examples.

This algorithm shows that $\mathcal{C} \in \mathcal{L}(2^{2n}, o(2^n))$, so
$A \in \mathcal{RL}_{\mathrm{wk}}(p(n), 2^{2n}, o(2^n))$. It follows that
$P_c(\mathrm{DENSE}^c) \subseteq \mathcal{RL}_{\mathrm{wk}}(2^{2n}, 2^{2n}, o(2^n))$. □
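
The simple learner for power-set concepts described in this proof can be sketched as follows. This is a minimal illustration; the class name `UnionLearner` and the toy target set used in `demo` are assumptions made only for this example.

```python
class UnionLearner:
    """Learns a concept P(X): an example Q is positive exactly when Q is a subset of X."""

    def __init__(self):
        self.hypothesis = set()  # H, the union of positive examples seen so far
        self.mistakes = 0

    def predict(self, example):
        # Predict 'yes' exactly when the example is contained in H.
        return set(example) <= self.hypothesis

    def update(self, example, label):
        if self.predict(example) == label:
            return
        self.mistakes += 1
        # Since H is always a subset of X, the only possible mistake is a
        # 'no' on a positive example, which adds a new element of X to H.
        if label:
            self.hypothesis |= set(example)


def demo():
    """Run the learner on a few examples for the hypothetical target X = {a, b, c}."""
    target = {"a", "b", "c"}
    examples = [{"a"}, {"a", "d"}, {"b", "c"}, {"a", "b"}, {"d"}, set()]
    learner = UnionLearner()
    for q in examples:
        learner.update(q, q <= target)
    return learner
```

Each mistake adds at least one new element of $X$ to $H$, which is why the number of mistakes can never exceed $|X|$.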

Since dimp(E) = 1, we have new proofs of the following results of Watanabe. 

Corollary 4.3. (Watanabe) $\mathrm{E} \not\subseteq P_d(\mathrm{DENSE}^c)$ and
$\mathrm{E} \not\subseteq P_c(\mathrm{DENSE}^c)$. That is, every $\leq^p_d$-hard or
$\leq^p_c$-hard set for E is dense.

5 Adaptive Reductions


In this section we prove our main theorem, which concerns adaptive reductions that make a
sublinear number of queries to a nondense set. It turns out that this problem can also be reduced to
learning disjunctions.

In a surprising result (refuting a conjecture of Ko), Allender, Hemaspaandra, Ogiwara,
and Watanabe showed that $P_{btt}(\mathrm{SPARSE}) \subseteq P_d(\mathrm{SPARSE})$. The disjunctive
reduction they obtain will not be polynomial-time computable if the original reduction has more
than a constant number of queries. However, in the proof of the following theorem we are still able
to exploit their technique, and obtain an exponential-time reduction to a disjunction. Then we can
apply the Winnow algorithm as in the previous section.

Theorem 5.1. For all $\alpha < 1$, $P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$ has p-dimension 0.

Proof. Let $L \leq^p_{n^\alpha\text{-T}} S \in \mathrm{DENSE}^c$ via some oracle machine $M$. We
will show how to reduce $L$ to a class of disjunctions.

Fix an input length $n$. For an input $x \in \{0,1\}^{\leq n}$, consider using each
$z \in \{0,1\}^{n^\alpha}$ as the sequence of yes/no answers to $M$’s queries. Each $z$ causes $M$
to produce a sequence of queries $w_0^{x,z}, \ldots, w_{k(x,z)}^{x,z}$, where
$k(x,z) < n^\alpha$, and an accepting or rejecting decision. Let $Z_x \subseteq \{0,1\}^{n^\alpha}$
be the set of all query answer sequences that cause $M$ to accept $x$. Then we have $x \in L$ if
and only if

$$(\exists z \in Z_x)(\forall\, 0 \leq j \leq k(x,z))\ S[w_j^{x,z}] = z[j],$$

which is equivalent to

$$(\exists z \in Z_x)(\forall\, 0 \leq j \leq k(x,z))\ z[j] \cdot w_j^{x,z} \in S^c \oplus S,$$

where $S^c \oplus S$ is the disjoint union $\{0x \mid x \in S^c\} \cup \{1x \mid x \in S\}$.

A key part of the proof that $P_{btt}(\mathrm{SPARSE}) \subseteq P_d(\mathrm{SPARSE})$ is to show
that $P_{1\text{-tt}}(\mathrm{SPARSE})$ is contained in $P_d(\mathrm{SPARSE})$. The same argument
yields that

$$P_{1\text{-tt}}(\mathrm{DENSE}^c) \subseteq P_d(\mathrm{DENSE}^c).$$

Therefore, there is a set $U \in \mathrm{DENSE}^c$ such that $S^c \oplus S \leq^p_d U$. Letting $g$
be this polynomial-time disjunctive reduction, we have $x \in L$ if and only if

$$(\exists z \in Z_x)(\forall\, 0 \leq j \leq k(x,z))\ g(z[j] \cdot w_j^{x,z}) \cap U \neq \emptyset.$$

For each $z \in Z_x$, let

$$H_{x,z} = \{(u_0, \ldots, u_{k(x,z)}) \mid (\forall\, 0 \leq j \leq k(x,z))\ u_j \in g(z[j] \cdot w_j^{x,z})\}.$$

Define

$$A_n = \{(u_0, \ldots, u_k) \mid k < n^\alpha \text{ and } (\forall\, 0 \leq j \leq k)\ u_j \in U\}.$$

Then we have $x \in L$ if and only if

$$(\exists z \in Z_x)(\exists v \in H_{x,z})\ v \in A_n.$$
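Letting

$$H_x = \bigcup_{z \in Z_x} H_{x,z},$$

we can rewrite this as

$$x \in L \iff H_x \cap A_n \neq \emptyset. \qquad (5.1)$$

Let $r(n)$ be a polynomial bounding the number of queries $g$ outputs on an input of form
$z[j] \cdot w_j^{x,z}$, where $|x| \leq n$. Then $|H_{x,z}| \leq r(n)^{n^\alpha}$, so

$$|H_x| \leq |Z_x| \cdot r(n)^{n^\alpha} \leq 2^{n^\alpha (1 + \log r(n))}. \qquad (5.2)$$

Also,

$$|A_n| \leq n^\alpha \cdot |U_{\leq r(n)}|^{n^\alpha}.$$

Let $\epsilon \in (0, 1 - \alpha)$, and let $\delta \in (\alpha + \epsilon, 1)$. Then since $U$ is
nondense, for infinitely many $n$, we have $|U_{\leq r(n)}| \leq 2^{n^\epsilon}$. This implies

$$(\exists^\infty n)\ |A_n| \leq n^\alpha \cdot 2^{n^{\alpha + \epsilon}} \leq 2^{n^\delta}. \qquad (5.3)$$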


Let

$$H_n = \bigcup_{x \in \{0,1\}^{\leq n}} H_x.$$

Then from (5.2), $|H_n| \leq 2^{2n}$ if $n$ is sufficiently large.

Enumerate $H_n$ as $h_1, \ldots, h_N$. We identify any $R \subseteq H_n$ with its characteristic
string $\chi_R \in \{0,1\}^N$ according to this enumeration. Let $\mathcal{C}_n$ be the concept
class of all monotone disjunctions on $\{0,1\}^N$ that have at most $2^{n^\delta}$ literals.

Define the disjunction

$$\phi_n = \bigvee_{i : h_i \in A_n} h_i,$$

which by (5.3) is in $\mathcal{C}_n$ for infinitely many $n$. For any $x \in \{0,1\}^{\leq n}$,
from (5.1) it follows that

$$x \in L \iff \phi_n(\chi_{H_x}) = 1.$$

Given $x \in \{0,1\}^{\leq n}$, we can compute $\chi_{H_x}$ in
$O(2^{n} \cdot \mathrm{poly}(n) + |H_n|)$ time. Therefore, letting
$\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$, we have
$L \leq^{2^{2n}}_{\mathrm{wk}} \mathcal{C}$. Since $\mathcal{C}_n$ is learnable by Winnow with at
most $2 \cdot 2^{n^\delta} \cdot \log |H_n| + 2 = o(2^n)$ mistakes, it follows that
$L \in \mathcal{RL}_{\mathrm{wk}}(2^{2n}, 2^{2n}, o(2^n))$. □


As a corollary, we have a positive answer to the question of Lutz and Mayordomo mentioned
in the introduction:


Corollary 5.2. For all $\alpha < 1$, $P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$ has p-measure 0.

Corollary 5.3. For all $\alpha < 1$, $\mathrm{E} \not\subseteq P_{n^\alpha\text{-T}}(\mathrm{DENSE}^c)$.
That is, every $\leq^p_{n^\alpha\text{-T}}$-hard set for E is dense.


If we scale down from nondense sets to sparse sets, the same proof technique can handle more 
queries. 
Theorem 5.4. $P_{o(n/\log n)\text{-T}}(\mathrm{SPARSE})$ has strong p-dimension 0.

Proof. Let $L \leq^p_{f(n)\text{-T}} S \in \mathrm{SPARSE}$ via some oracle machine $M$, where
$f(n) = o(n/\log n)$.

Fix an input length $n$. For an input $x \in \{0,1\}^n$, each query answer sequence
$z \in \{0,1\}^{f(n)}$ causes $M$ to produce a sequence of queries
$w_0^{x,z}, \ldots, w_{k(x,z)}^{x,z}$, where $k(x,z) < f(n)$, and an accepting or rejecting
decision. Let $Z_x \subseteq \{0,1\}^{f(n)}$ be the set of all query answer sequences that cause
$M$ to accept $x$. Then we have $x \in L$ if and only if

$$(\exists z \in Z_x)(\forall\, 0 \leq j \leq k(x,z))\ z[j] \cdot w_j^{x,z} \in S^c \oplus S.$$

Since $P_{1\text{-tt}}(\mathrm{SPARSE}) \subseteq P_d(\mathrm{SPARSE})$, there is a set
$U \in \mathrm{SPARSE}$ such that $S^c \oplus S \leq^p_d U$. Letting $g$ be this polynomial-time
disjunctive reduction, we have $x \in L$ if and only if

$$(\exists z \in Z_x)(\forall\, 0 \leq j \leq k(x,z))\ g(z[j] \cdot w_j^{x,z}) \cap U \neq \emptyset.$$

As before, we can define sets $H_x$ and $A_n$ so that

$$x \in L \iff H_x \cap A_n \neq \emptyset.$$

Let $r(n)$ be a polynomial bounding the number of queries $g$ outputs on an input of form
$z[j] \cdot w_j^{x,z}$, where $|x| = n$. Then

$$|H_x| \leq |Z_x| \cdot r(n)^{f(n)} \leq 2^{f(n)(1 + \log r(n))},$$

so we have $|H_x| \leq 2^n$ if $n$ is sufficiently large because $f(n) = o(n/\log n)$. Letting
$H_n = \bigcup_{x \in \{0,1\}^n} H_x$, we have $|H_n| \leq 2^{2n}$.

Also,

$$|A_n| \leq f(n) \cdot |U_{\leq r(n)}|^{f(n)}.$$

Let $q(n)$ be a polynomial such that $|U_{\leq r(n)}| \leq q(n)$ for all $n$. Then

$$|A_n| \leq f(n) \cdot q(n)^{f(n)} \leq 2^{f(n) \log q(n) + \log f(n)}.$$

Let $v(n) = f(n) \log q(n) + \log f(n)$. Notice that $v(n) = o(n)$ because $f(n) = o(n/\log n)$.

As before, we enumerate $H_n$ as $h_1, \ldots, h_N$ and identify any $R \subseteq H_n$ with its
characteristic string $\chi_R \in \{0,1\}^N$. Let $\mathcal{C}_n$ be the concept class of all
monotone disjunctions on $\{0,1\}^N$ that have at most $2^{v(n)}$ literals. The disjunction
$\phi_n = \bigvee_{i : h_i \in A_n} h_i$ is in $\mathcal{C}_n$ for every $n$. For any
$x \in \{0,1\}^n$, we have

$$x \in L \iff \phi_n(\chi_{H_x}) = 1.$$

Given $x \in \{0,1\}^n$, we can compute $\chi_{H_x}$ in $O(2^n \cdot \mathrm{poly}(n) + |H_n|)$
time. Therefore, letting $\mathcal{C} = (\mathcal{C}_n \mid n \in \mathbb{N})$, we have
$L \leq^{2^{2n}}_{\mathrm{str}} \mathcal{C}$. Since $\mathcal{C}_n$ is learnable by Winnow with at
most $2 \cdot 2^{v(n)} \cdot \log |H_n| + 2 = o(2^n)$ mistakes, it follows that
$L \in \mathcal{RL}_{\mathrm{str}}(2^{2n}, 2^{2n}, o(2^n))$. □

The following corollary improves the result of Fu that
$\mathrm{E} \not\subseteq P_{o(n/\log n)\text{-T}}(\mathrm{TALLY})$.

Corollary 5.5. $\mathrm{E} \not\subseteq P_{o(n/\log n)\text{-T}}(\mathrm{SPARSE})$.

Since Wilson constructed an oracle relative to which
$\mathrm{E} \subseteq P_{O(n)\text{-T}}(\mathrm{SPARSE})$, Corollary 5.5 is near the limits of
relativizable techniques.

In Theorem 5.4 we used strong dimension, which raises a technical point. The results about
reductions to $\mathrm{DENSE}^c$ cannot be strengthened to strong p-dimension simply because the
class $\mathrm{DENSE}^c$ itself has strong dimension 1. This is because being nondense is an
infinitely-often property. However, if we replace $\mathrm{DENSE}^c$ by SPARSE in any of our
results, the proofs can be adapted to show that the resulting class has strong p-dimension 0. We
can also obtain strong dimension results by substituting the larger class
$\mathrm{DENSE}^c_{\mathrm{i.o.}}$, where $\mathrm{DENSE}_{\mathrm{i.o.}}$ is the class of all $L$
that satisfy $(\exists \epsilon > 0)(\exists^\infty n)\ |L_{\leq n}| > 2^{n^\epsilon}$.

6 Hard Sets for NP 

The hypothesis “NP has positive p-dimension,” written $\dim_p(\mathrm{NP}) > 0$, was first used to
study the inapproximability of MAX3SAT. This positive dimension hypothesis is apparently much
weaker than Lutz’s often-investigated $\mu_p(\mathrm{NP}) \neq 0$ hypothesis, but is a stronger
assumption than P $\neq$ NP:

$$\mu_p(\mathrm{NP}) \neq 0 \Rightarrow \dim_p(\mathrm{NP}) = 1 \Rightarrow \dim_p(\mathrm{NP}) > 0 \Rightarrow \mathrm{P} \neq \mathrm{NP}.$$

The measure hypothesis $\mu_p(\mathrm{NP}) \neq 0$ has many plausible consequences that are not
known to follow from P $\neq$ NP (see e.g. [21]). So far few consequences of
$\dim_p(\mathrm{NP}) > 0$ are known. The following corollary of our results begins to remedy this.

Theorem 6.1. If $\dim_p(\mathrm{NP}) > 0$, then every set that is hard for NP under
$\leq^p_d$-reductions, $\leq^p_c$-reductions, or $\leq^p_{n^\alpha\text{-T}}$-reductions
($\alpha < 1$) is dense, and every set that is hard under $\leq^p_{o(n/\log n)\text{-T}}$-reductions
is not sparse.

The consequences in Theorem 6.1 are much stronger than what is known to follow from P $\neq$ NP.
If P $\neq$ NP, then no $\leq^p_{btt}$-hard or $\leq^p_c$-hard set is sparse, but it is not known
whether hard sets under disjunctive reductions or unbounded Turing reductions can be sparse.

Another result is that if NP $\neq$ RP, then no $\leq^p_d$-hard set for NP is sparse. It is
interesting to see that while the hypotheses $\dim_p(\mathrm{NP}) > 0$ and NP $\neq$ RP are
apparently incomparable, they both have implications for the density of the disjunctively-hard
sets for NP.

7 Conclusion 

Our connection between online learning and resource-bounded dimension appears to be a powerful
tool for computational complexity. We have used it to give relatively simple proofs and improvements
of several previous results.

An interesting observation is that for all reductions $\leq^p_\tau$ for which we know how to prove
“every $\leq^p_\tau$-hard set for E is dense,” by the results presented here we can actually prove
“$P_\tau(\mathrm{DENSE}^c)$ has p-dimension 0.” Indeed, we have proven the strongest results for
Turing reductions in this way.